Clustering, Single Linkage, and Pairwise Distance Concentration

Author

  • Sham Kakade
Abstract

But what about high dimensions? What is the density of the points near the mean? And how far is a typical point from its component mean? Let us address these questions for a single isotropic Gaussian distribution in n dimensions with variance σ² per coordinate. First, note that E[‖x‖²] = nσ². Hence, on average, we expect a point to be rather far from the mean, but let us quantify this. Recall that ‖x‖²/σ² follows a χ² distribution with n degrees of freedom. Hence, the variance of ‖x‖² is 2nσ⁴ (so the standard deviation is √(2n) σ²). We therefore expect the squared distance to the mean to be nσ² ± O(√n σ²), which puts the distance itself at about √n σ with fluctuations of only O(σ). So not only do we expect the points to be far from the mean, their distance from it is also tightly concentrated.
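
The concentration described in the abstract can be checked numerically. The following is a minimal sketch, not taken from the source: it assumes a zero-mean isotropic Gaussian, uses only the Python standard library, and the function name `sample_norms` and the values of `n`, `sigma`, and `num_points` are illustrative choices. It compares empirical moments against the standard facts that ‖x‖² has mean nσ² and standard deviation √(2n) σ², so that ‖x‖ sits near √n σ with a spread of order σ.

```python
import math
import random
import statistics


def sample_norms(n, sigma, num_points, seed=0):
    """Draw points from N(0, sigma^2 * I_n) and return their Euclidean norms.

    Hypothetical helper for illustration; the mean is taken to be the origin,
    so each norm is the distance of a point from its component mean.
    """
    rng = random.Random(seed)
    norms = []
    for _ in range(num_points):
        squared = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n))
        norms.append(math.sqrt(squared))
    return norms


# Illustrative parameters (assumptions, not values from the source).
n, sigma = 400, 2.0
norms = sample_norms(n, sigma, num_points=2000)
sq_norms = [r * r for r in norms]

mean_sq = statistics.mean(sq_norms)    # theory: n * sigma^2
std_sq = statistics.pstdev(sq_norms)   # theory: sqrt(2n) * sigma^2
mean_norm = statistics.mean(norms)     # theory: close to sqrt(n) * sigma
std_norm = statistics.pstdev(norms)    # theory: order sigma, independent of n

print(f"E[|x|^2]   empirical {mean_sq:.1f}  vs theory {n * sigma**2:.1f}")
print(f"std[|x|^2] empirical {std_sq:.1f}  vs theory {math.sqrt(2 * n) * sigma**2:.1f}")
print(f"E[|x|]     empirical {mean_norm:.2f}  vs sqrt(n)*sigma {math.sqrt(n) * sigma:.2f}")
print(f"std[|x|]   empirical {std_norm:.2f}  (compare sigma = {sigma:.2f})")
```

Rerunning with a larger `n` should show the mean distance growing like √n while its spread stays of order σ, which is the sense in which points are both far from the mean and tightly concentrated around a thin shell.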

Similar resources

Choosing the Best Hierarchical Clustering Technique Based on Principal Components Analysis for Suspended Sediment Load Estimation

1- INTRODUCTION The assessment of watershed sediment load is necessary for controlling soil erosion and reducing the potential of sediment production. Different estimates of sediment amounts along with the lack of long-term measurements limit the accessibility to reliable data series of erosion rate and sediment yield. Therefore, the observed data of suspended sediment load could be used to ...

Genie: A new, fast, and outlier-resistant hierarchical clustering algorithm

The time needed to apply a hierarchical clustering algorithm is most often dominated by the number of computations of a pairwise dissimilarity measure. Such a constraint, for larger data sets, puts at a disadvantage the use of all the classical linkage criteria but the single linkage one. However, it is known that the single linkage clustering algorithm is very sensitive to outliers, produces h...

An Experimental Survey on Single Linkage Clustering

Clusters are useful to identify required object from the huge amount of datasets. There are lots of clustering methods, used to create clusters. Single linkage clustering method is an example of hierarchical agglomerative clustering which is used to merge objects in a cluster, based on minimum distance. In this paper we performed an experiment on two dimensional spaces where multiple objects ar...

Analysis of Cluster Formation Techniques for Multi-robot Task Allocation Using Sequential Single-Cluster Auctions

Recent research has shown the benefits of using K-means clustering in task allocation to robots. However, there is little evaluation of other clustering techniques. In this paper we compare K-means clustering to single-linkage clustering and consider the effects of straight line and true path distance metrics in cluster formation. Our empirical results show single-linkage clustering with a tr...

Clustering Molecular Dynamics Trajectories: 1. Characterizing the Performance of Different Clustering Algorithms.

Molecular dynamics simulation methods produce trajectories of atomic positions (and optionally velocities and energies) as a function of time and provide a representation of the sampling of a given molecule's energetically accessible conformational ensemble. As simulations on the 10-100 ns time scale become routine, with sampled configurations stored on the picosecond time scale, such trajector...

Publication date: 2010